Abstract

The world has been struggling with the COVID-19 epidemic since December 2019. The virus causes severe respiratory complications, which can lead to the patient's death. The number of deaths increases sharply with age; however, the young are also at risk. White blood cells are a key part of the human immune system: their main task is to protect the body against infections and diseases. The aim of the analysis presented below is to investigate how the count and percentage of white blood cell groups, including neutrophils and lymphocytes, relate to patient mortality. The author analyzed blood samples of 375 patients, collected from January 10 to February 18, 2020 in the region of Wuhan, China, to identify mortality risk on the basis of the percentages of different white blood cell types in the blood.

Libraries list

## R version 4.0.4 (2021-02-15)
## Platform: i386-w64-mingw32/i386 (32-bit)
## Running under: Windows 10 x64 (build 18363)
## 
## Matrix products: default
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] caret_6.0-88    lattice_0.20-41 mlbench_2.1-3   ggsci_2.9      
##  [5] plyr_1.8.6      zoo_1.8-9       dplyr_1.0.5     plotly_4.9.3   
##  [9] ggplot2_3.3.3   tidyr_1.1.3     knitr_1.32      openxlsx_4.2.3 
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.6           lubridate_1.7.10     class_7.3-18        
##  [4] assertthat_0.2.1     digest_0.6.27        ipred_0.9-11        
##  [7] foreach_1.5.1        utf8_1.2.1           R6_2.5.0            
## [10] stats4_4.0.4         evaluate_0.14        httr_1.4.2          
## [13] pillar_1.6.0         rlang_0.4.10         lazyeval_0.2.2      
## [16] data.table_1.14.0    jquerylib_0.1.4      rpart_4.1-15        
## [19] Matrix_1.3-2         rmarkdown_2.7        splines_4.0.4       
## [22] gower_0.2.2          stringr_1.4.0        htmlwidgets_1.5.3   
## [25] munsell_0.5.0        compiler_4.0.4       xfun_0.22           
## [28] pkgconfig_2.0.3      htmltools_0.5.1.1    nnet_7.3-15         
## [31] tidyselect_1.1.0     prodlim_2019.11.13   tibble_3.1.0        
## [34] codetools_0.2-18     fansi_0.4.2          viridisLite_0.4.0   
## [37] crayon_1.4.1         withr_2.4.1          ModelMetrics_1.2.2.2
## [40] MASS_7.3-53          recipes_0.1.16       grid_4.0.4          
## [43] nlme_3.1-152         jsonlite_1.7.2       gtable_0.3.0        
## [46] lifecycle_1.0.0      DBI_1.1.1            magrittr_2.0.1      
## [49] pROC_1.17.0.1        scales_1.1.1         zip_2.1.1           
## [52] stringi_1.5.3        reshape2_1.4.4       timeDate_3043.102   
## [55] bslib_0.2.5.1        ellipsis_0.3.1       generics_0.1.0      
## [58] vctrs_0.3.7          lava_1.6.9           iterators_1.0.13    
## [61] tools_4.0.4          glue_1.4.2           purrr_0.3.4         
## [64] survival_3.2-11      yaml_2.2.1           colorspace_2.0-0    
## [67] sass_0.4.0
## Loading required package: data.table
## 
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
## 
##     between, first, last

Input data summary

Dimensions

6120 rows, 81 columns
Hypersensitive.cardiac.troponinI hemoglobin Serum.chloride Prothrombin.time procalcitonin eosinophils(%) Interleukin.2.receptor Alkaline.phosphatase albumin basophil(%) Interleukin.10 Total.bilirubin Platelet.count monocytes(%) antithrombin Interleukin.8 indirect.bilirubin Red.blood.cell.distribution.width neutrophils(%) total.protein Quantification.of.Treponema.pallidum.antibodies Prothrombin.activity HBsAg mean.corpuscular.volume hematocrit White.blood.cell.count Tumor.necrosis.factorα mean.corpuscular.hemoglobin.concentration fibrinogen Interleukin.1β Urea lymphocyte.count PH.value Red.blood.cell.count Eosinophil.count Corrected.calcium Serum.potassium glucose neutrophils.count Direct.bilirubin Mean.platelet.volume ferritin RBC.distribution.width.SD Thrombin.time (%)lymphocyte HCV.antibody.quantification D-D.dimer Total.cholesterol aspartate.aminotransferase Uric.acid HCO3- calcium Amino-terminal.brain.natriuretic.peptide.precursor(NT-proBNP) Lactate.dehydrogenase platelet.large.cell.ratio Interleukin.6 Fibrin.degradation.products monocytes.count PLT.distribution.width globulin γ-glutamyl.transpeptidase International.standard.ratio basophil.count(#) 2019-nCoV.nucleic.acid.detection mean.corpuscular.hemoglobin Activation.of.partial.thromboplastin.time High.sensitivity.C-reactive.protein HIV.antibody.quantification serum.sodium thrombocytocrit ESR glutamic-pyruvic.transaminase eGFR creatinine
Min. : 1.9 Min. : 6.4 Min. : 71.50 Min. : 11.50 Min. : 0.020 Min. :0.000 Min. : 61.0 Min. : 17.00 Min. :13.60 Min. :0.00 Min. : 5.00 Min. : 2.50 Min. : -1.0 Min. : 0.300 Min. : 20.00 Min. : 5.000 Min. : 0.100 Min. :10.60 Min. : 1.7 Min. :31.80 Min. : 0.020 Min. : 6.00 Min. : 0.000 Min. : 61.60 Min. :14.50 Min. : 0.13 Min. : 4.00 Min. :286.0 Min. : 0.500 Min. : 5.00 Min. : 0.800 Min. : 0.000 Min. :5.000 Min. : 0.100 Min. :0.000 Min. :1.650 Min. : 2.760 Min. : 1.000 Min. : 0.06 Min. : 1.600 Min. : 8.50 Min. : 17.8 Min. : 31.30 Min. : 13.00 Min. : 0.000 Min. :0.020 Min. : 0.210 Min. :0.100 Min. : 6.00 Min. : 43.0 Min. : 6.30 Min. :1.170 Min. : 5 Min. : 110.0 Min. :11.20 Min. : 1.500 Min. : 4.00 Min. : 0.010 Min. : 8.00 Min. :10.10 Min. : 3.00 Min. : 0.840 Min. :0.000 Min. :-1 Min. :20.4 Min. : 21.80 Min. : 0.10 Min. :0.05 Min. :115.4 Min. :0.010 Min. : 1.00 Min. : 5.00 Min. : 2.00 Min. : 11.00
1st Qu.: 4.4 1st Qu.:113.0 1st Qu.: 99.05 1st Qu.: 13.60 1st Qu.: 0.040 1st Qu.:0.000 1st Qu.: 459.5 1st Qu.: 54.00 1st Qu.:27.40 1st Qu.:0.10 1st Qu.: 5.00 1st Qu.: 7.40 1st Qu.:109.0 1st Qu.: 2.800 1st Qu.: 74.00 1st Qu.: 8.675 1st Qu.: 3.800 1st Qu.:12.00 1st Qu.:65.1 1st Qu.:61.00 1st Qu.: 0.040 1st Qu.: 65.00 1st Qu.: 0.000 1st Qu.: 86.90 1st Qu.:33.50 1st Qu.: 4.94 1st Qu.: 6.70 1st Qu.:333.0 1st Qu.: 3.050 1st Qu.: 5.00 1st Qu.: 4.000 1st Qu.: 0.460 1st Qu.:6.000 1st Qu.: 3.680 1st Qu.:0.000 1st Qu.:2.270 1st Qu.: 3.950 1st Qu.: 5.550 1st Qu.: 3.09 1st Qu.: 3.225 1st Qu.:10.10 1st Qu.: 377.2 1st Qu.: 38.50 1st Qu.: 15.60 1st Qu.: 3.925 1st Qu.:0.040 1st Qu.: 0.603 1st Qu.:3.010 1st Qu.: 19.50 1st Qu.: 183.2 1st Qu.:21.00 1st Qu.:1.980 1st Qu.: 150 1st Qu.: 218.0 1st Qu.:25.60 1st Qu.: 4.772 1st Qu.: 4.00 1st Qu.: 0.270 1st Qu.:11.10 1st Qu.:29.70 1st Qu.: 22.00 1st Qu.: 1.030 1st Qu.:0.010 1st Qu.:-1 1st Qu.:29.7 1st Qu.: 35.30 1st Qu.: 5.70 1st Qu.:0.07 1st Qu.:137.7 1st Qu.:0.150 1st Qu.: 14.00 1st Qu.: 16.00 1st Qu.: 63.58 1st Qu.: 58.00
Median : 20.6 Median :125.0 Median :102.10 Median : 14.80 Median : 0.100 Median :0.100 Median : 676.5 Median : 69.50 Median :32.20 Median :0.20 Median : 5.90 Median : 10.70 Median :178.0 Median : 5.700 Median : 86.00 Median : 16.000 Median : 5.400 Median :12.60 Median :82.4 Median :65.90 Median : 0.050 Median : 81.00 Median : 0.010 Median : 90.10 Median :36.60 Median : 7.72 Median : 8.60 Median :343.0 Median : 4.120 Median : 5.00 Median : 5.985 Median : 0.800 Median :6.500 Median : 4.140 Median :0.010 Median :2.360 Median : 4.410 Median : 6.990 Median : 5.85 Median : 4.800 Median :10.80 Median : 711.0 Median : 40.90 Median : 16.80 Median :11.450 Median :0.060 Median : 2.155 Median :3.630 Median : 27.00 Median : 243.7 Median :23.50 Median :2.080 Median : 585 Median : 340.0 Median :30.90 Median : 19.265 Median : 17.90 Median : 0.410 Median :12.40 Median :32.70 Median : 34.00 Median : 1.140 Median :0.010 Median :-1 Median :30.9 Median : 39.20 Median : 51.50 Median :0.09 Median :140.4 Median :0.210 Median : 28.00 Median : 24.00 Median : 87.90 Median : 76.00
Mean : 1223.2 Mean :123.1 Mean :103.14 Mean : 16.68 Mean : 1.107 Mean :0.629 Mean : 907.2 Mean : 82.47 Mean :32.01 Mean :0.21 Mean : 16.07 Mean : 16.70 Mean :184.3 Mean : 6.155 Mean : 85.32 Mean : 83.088 Mean : 6.889 Mean :13.07 Mean :77.6 Mean :65.30 Mean : 0.132 Mean : 78.55 Mean : 8.306 Mean : 90.39 Mean :36.55 Mean : 15.60 Mean : 11.58 Mean :342.8 Mean : 4.294 Mean : 6.51 Mean : 9.589 Mean : 1.017 Mean :6.484 Mean : 9.288 Mean :0.039 Mean :2.355 Mean : 4.509 Mean : 8.889 Mean : 7.81 Mean : 9.887 Mean :10.91 Mean : 1379.1 Mean : 42.44 Mean : 18.17 Mean :15.392 Mean :0.117 Mean : 7.943 Mean :3.689 Mean : 46.53 Mean : 276.1 Mean :23.14 Mean :2.078 Mean : 3669 Mean : 474.2 Mean :31.77 Mean : 112.308 Mean : 61.35 Mean : 0.526 Mean :13.01 Mean :33.24 Mean : 55.34 Mean : 1.313 Mean :0.017 Mean :-1 Mean :31.0 Mean : 41.52 Mean : 76.24 Mean :0.10 Mean :141.6 Mean :0.212 Mean : 33.69 Mean : 38.86 Mean : 81.56 Mean : 109.93
3rd Qu.: 223.8 3rd Qu.:137.0 3rd Qu.:105.65 3rd Qu.: 16.70 3rd Qu.: 0.405 3rd Qu.:0.800 3rd Qu.:1155.5 3rd Qu.: 95.00 3rd Qu.:36.60 3rd Qu.:0.30 3rd Qu.: 12.35 3rd Qu.: 16.77 3rd Qu.:248.0 3rd Qu.: 8.600 3rd Qu.: 97.00 3rd Qu.: 35.200 3rd Qu.: 8.000 3rd Qu.:13.70 3rd Qu.:92.3 3rd Qu.:70.45 3rd Qu.: 0.070 3rd Qu.: 95.00 3rd Qu.: 0.010 3rd Qu.: 93.90 3rd Qu.:39.90 3rd Qu.: 12.72 3rd Qu.: 11.50 3rd Qu.:350.0 3rd Qu.: 5.480 3rd Qu.: 5.00 3rd Qu.:11.400 3rd Qu.: 1.310 3rd Qu.:7.294 3rd Qu.: 4.650 3rd Qu.:0.060 3rd Qu.:2.440 3rd Qu.: 4.870 3rd Qu.:10.260 3rd Qu.:10.95 3rd Qu.: 8.275 3rd Qu.:11.50 3rd Qu.: 1425.2 3rd Qu.: 44.70 3rd Qu.: 18.38 3rd Qu.:24.975 3rd Qu.:0.090 3rd Qu.:21.000 3rd Qu.:4.265 3rd Qu.: 42.00 3rd Qu.: 333.8 3rd Qu.:25.90 3rd Qu.:2.190 3rd Qu.: 2625 3rd Qu.: 601.8 3rd Qu.:37.20 3rd Qu.: 60.167 3rd Qu.:150.00 3rd Qu.: 0.580 3rd Qu.:14.30 3rd Qu.:36.50 3rd Qu.: 58.00 3rd Qu.: 1.330 3rd Qu.:0.020 3rd Qu.:-1 3rd Qu.:32.2 3rd Qu.: 44.12 3rd Qu.:118.50 3rd Qu.:0.11 3rd Qu.:143.5 3rd Qu.:0.270 3rd Qu.: 45.50 3rd Qu.: 41.00 3rd Qu.:103.97 3rd Qu.: 98.25
Max. :50000.0 Max. :178.0 Max. :140.40 Max. :120.00 Max. :57.170 Max. :8.600 Max. :7500.0 Max. :620.00 Max. :48.60 Max. :1.70 Max. :1000.00 Max. :505.70 Max. :558.0 Max. :53.000 Max. :136.00 Max. :6795.000 Max. :145.100 Max. :27.10 Max. :98.9 Max. :88.70 Max. :11.950 Max. :142.00 Max. :250.000 Max. :118.90 Max. :52.30 Max. :1726.60 Max. :168.00 Max. :514.0 Max. :10.780 Max. :88.50 Max. :68.400 Max. :52.420 Max. :7.565 Max. :749.500 Max. :0.490 Max. :2.790 Max. :12.800 Max. :43.010 Max. :33.88 Max. :360.600 Max. :15.00 Max. :50000.0 Max. :113.30 Max. :161.90 Max. :60.000 Max. :2.090 Max. :60.000 Max. :7.300 Max. :1858.00 Max. :1176.0 Max. :36.30 Max. :2.620 Max. :70000 Max. :1867.0 Max. :62.20 Max. :5000.000 Max. :190.80 Max. :39.920 Max. :25.30 Max. :50.60 Max. :732.00 Max. :13.480 Max. :0.120 Max. :-1 Max. :50.8 Max. :144.00 Max. :320.00 Max. :0.27 Max. :179.7 Max. :0.510 Max. :110.00 Max. :1600.00 Max. :224.00 Max. :1497.00
NA’s :5613 NA’s :5145 NA’s :5145 NA’s :5458 NA’s :5661 NA’s :5163 NA’s :5852 NA’s :5190 NA’s :5186 NA’s :5163 NA’s :5853 NA’s :5190 NA’s :5163 NA’s :5162 NA’s :5790 NA’s :5852 NA’s :5214 NA’s :5197 NA’s :5163 NA’s :5189 NA’s :5841 NA’s :5461 NA’s :5841 NA’s :5163 NA’s :5163 NA’s :4993 NA’s :5852 NA’s :5163 NA’s :5554 NA’s :5852 NA’s :5184 NA’s :5163 NA’s :5736 NA’s :4993 NA’s :5163 NA’s :5206 NA’s :5140 NA’s :5345 NA’s :5163 NA’s :5190 NA’s :5258 NA’s :5837 NA’s :5197 NA’s :5554 NA’s :5162 NA’s :5841 NA’s :5490 NA’s :5189 NA’s :5185 NA’s :5186 NA’s :5186 NA’s :5141 NA’s :5645 NA’s :5186 NA’s :5258 NA’s :5848 NA’s :5790 NA’s :5163 NA’s :5258 NA’s :5190 NA’s :5190 NA’s :5461 NA’s :5163 NA’s :5619 NA’s :5163 NA’s :5552 NA’s :5383 NA’s :5842 NA’s :5145 NA’s :5258 NA’s :5737 NA’s :5189 NA’s :5184 NA’s :5184
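The NA row of the summary shows how sparse the table is: hemoglobin, for example, is missing in 5145 of 6120 rows, which is about 84%. The following is a minimal sketch of how such counts can be obtained; the data frame `df` and its values are made up for illustration and are not the original data.

```r
# Toy stand-in for the blood-sample table; the values are hypothetical.
df <- data.frame(
  hemoglobin = c(125, NA, NA, 137),
  albumin    = c(NA, NA, 32.2, NA)
)

# Number of missing values per column (the "NA's" row of summary()).
na_count <- colSums(is.na(df))

# Share of missing values per column, as a percentage.
na_share <- round(100 * colMeans(is.na(df)), 1)

print(data.frame(na_count, na_share))
```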

Parameters list:

##  [1] "PATIENT_ID"                                                   
##  [2] "RE_DATE"                                                      
##  [3] "age"                                                          
##  [4] "gender"                                                       
##  [5] "Admission.time"                                               
##  [6] "Discharge.time"                                               
##  [7] "outcome"                                                      
##  [8] "Hypersensitive.cardiac.troponinI"                             
##  [9] "hemoglobin"                                                   
## [10] "Serum.chloride"                                               
## [11] "Prothrombin.time"                                             
## [12] "procalcitonin"                                                
## [13] "eosinophils(%)"                                               
## [14] "Interleukin.2.receptor"                                       
## [15] "Alkaline.phosphatase"                                         
## [16] "albumin"                                                      
## [17] "basophil(%)"                                                  
## [18] "Interleukin.10"                                               
## [19] "Total.bilirubin"                                              
## [20] "Platelet.count"                                               
## [21] "monocytes(%)"                                                 
## [22] "antithrombin"                                                 
## [23] "Interleukin.8"                                                
## [24] "indirect.bilirubin"                                           
## [25] "Red.blood.cell.distribution.width"                            
## [26] "neutrophils(%)"                                               
## [27] "total.protein"                                                
## [28] "Quantification.of.Treponema.pallidum.antibodies"              
## [29] "Prothrombin.activity"                                         
## [30] "HBsAg"                                                        
## [31] "mean.corpuscular.volume"                                      
## [32] "hematocrit"                                                   
## [33] "White.blood.cell.count"                                       
## [34] "Tumor.necrosis.factorα"
## [35] "mean.corpuscular.hemoglobin.concentration"                    
## [36] "fibrinogen"                                                   
## [37] "Interleukin.1β"
## [38] "Urea"                                                         
## [39] "lymphocyte.count"                                             
## [40] "PH.value"                                                     
## [41] "Red.blood.cell.count"                                         
## [42] "Eosinophil.count"                                             
## [43] "Corrected.calcium"                                            
## [44] "Serum.potassium"                                              
## [45] "glucose"                                                      
## [46] "neutrophils.count"                                            
## [47] "Direct.bilirubin"                                             
## [48] "Mean.platelet.volume"                                         
## [49] "ferritin"                                                     
## [50] "RBC.distribution.width.SD"                                    
## [51] "Thrombin.time"                                                
## [52] "(%)lymphocyte"                                                
## [53] "HCV.antibody.quantification"                                  
## [54] "D-D.dimer"                                                    
## [55] "Total.cholesterol"                                            
## [56] "aspartate.aminotransferase"                                   
## [57] "Uric.acid"                                                    
## [58] "HCO3-"                                                        
## [59] "calcium"                                                      
## [60] "Amino-terminal.brain.natriuretic.peptide.precursor(NT-proBNP)"
## [61] "Lactate.dehydrogenase"                                        
## [62] "platelet.large.cell.ratio"                                    
## [63] "Interleukin.6"                                                
## [64] "Fibrin.degradation.products"                                  
## [65] "monocytes.count"                                              
## [66] "PLT.distribution.width"                                       
## [67] "globulin"                                                     
## [68] "γ-glutamyl.transpeptidase"
## [69] "International.standard.ratio"                                 
## [70] "basophil.count(#)"                                            
## [71] "2019-nCoV.nucleic.acid.detection"                             
## [72] "mean.corpuscular.hemoglobin"                                  
## [73] "Activation.of.partial.thromboplastin.time"                    
## [74] "High.sensitivity.C-reactive.protein"                          
## [75] "HIV.antibody.quantification"                                  
## [76] "serum.sodium"                                                 
## [77] "thrombocytocrit"                                              
## [78] "ESR"                                                          
## [79] "glutamic-pyruvic.transaminase"                                
## [80] "eGFR"                                                         
## [81] "creatinine"

Attributes analysis

Average patient age: 59

The oldest patient: 95

The youngest patient: 18

Number of survivors: 201

Number of deaths: 174

Gender   Died   Survived
Female   48     103
Male     126    98
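Because the table holds 6120 sample rows for 375 patients, a gender-by-outcome table has to be built from one row per patient. The sketch below shows one way to do this. It assumes that `PATIENT_ID`, `gender` and `outcome` are filled in on every row and already carry text labels, which may not hold for the raw file; all values are made up.

```r
# Toy long-format data: several blood draws per patient; values are hypothetical.
samples <- data.frame(
  PATIENT_ID = c(1, 1, 2, 3, 3, 3, 4),
  gender     = c("Male", "Male", "Female", "Male", "Male", "Male", "Female"),
  outcome    = c("Died", "Died", "Survived", "Survived", "Survived",
                 "Survived", "Died")
)

# Keep the first row of each patient so that patients with many samples
# are not counted more than once.
patients <- samples[!duplicated(samples$PATIENT_ID), ]

# Gender by outcome contingency table.
gender_outcome <- table(patients$gender, patients$outcome)
print(gender_outcome)
```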

Patients information

Gender   Average age
Female   55.17881
Male     61.28571
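The overall average age can be cross-checked against the per-gender averages. The sketch below assumes that the gender column of the table carries the correct labels (Female 55.17881, Male 61.28571) and takes the group sizes from the gender-by-outcome table above.

```r
n_female <- 48 + 103   # deceased + surviving women = 151
n_male   <- 126 + 98   # deceased + surviving men   = 224

mean_age_female <- 55.17881
mean_age_male   <- 61.28571

# The weighted mean of the two group averages gives the overall average age.
overall_mean_age <- (n_female * mean_age_female + n_male * mean_age_male) /
  (n_female + n_male)

print(round(overall_mean_age, 2))  # 58.83, which rounds to the reported 59
```

With the two labels swapped, the weighted mean would be about 57.6, which does not match the reported average of 59.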

Blood samples analysis

Notice: The normal lymphocyte percentage is 10-45% of all white blood cells.

Notice: Neutrophils are the most numerous group of white blood cells in the immune system. Their task is to protect the body against infections and diseases. The norm for neutrophils is 60-70% of all white blood cells.
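The reference ranges quoted in the notices can be used to flag individual samples. This is a minimal sketch with made-up values; it assumes that the 10-45% range refers to lymphocytes and uses the column names from the parameters list. The helper `in_range` is hypothetical.

```r
# Toy data; check.names = FALSE keeps the non-syntactic column names intact.
blood <- data.frame(
  `(%)lymphocyte`  = c(25.0, 4.1, 11.4, NA),
  `neutrophils(%)` = c(65.1, 92.3, 82.4, 58.0),
  check.names = FALSE
)

# TRUE only when a value is present and lies inside [lower, upper];
# missing values are reported as FALSE (not confirmed normal).
in_range <- function(x, lower, upper) !is.na(x) & x >= lower & x <= upper

blood$lymphocyte_normal <- in_range(blood[["(%)lymphocyte"]], 10, 45)
blood$neutrophil_normal <- in_range(blood[["neutrophils(%)"]], 60, 70)

print(blood)
```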

Male and female patients, deceased and survived, as a function of age

Data correlations - Pearson method
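With this many missing values, a Pearson correlation has to be told how to treat NA entries, otherwise the result is NA. The following sketch uses base R's `cor` with pairwise deletion on made-up values; whether the original analysis used the same `use` setting is an assumption.

```r
# Toy data using two column names from the parameters list; values are hypothetical.
blood <- data.frame(
  `neutrophils(%)` = c(60, 70, 80, 90, NA),
  `(%)lymphocyte`  = c(30, 25, 20, 15, 10),
  check.names = FALSE
)

# Pearson correlation matrix; each pair of columns uses only the rows
# in which both values are present.
cor_matrix <- cor(blood, method = "pearson", use = "pairwise.complete.obs")
print(cor_matrix)

r <- cor_matrix["neutrophils(%)", "(%)lymphocyte"]
```

In this toy example the four complete pairs lie on a straight falling line, so `r` is -1.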

Creating predictive models with caret

The data were divided into training and testing sets with p = 0.75, that is, 75% of the samples were assigned to the training set.

## [1] "Died"     "Survived"
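A split with p = 0.75 that keeps the class proportions can be sketched in base R as follows. This is a stand-in for illustration, not the author's code: in caret the corresponding call would typically be `createDataPartition(outcome, p = 0.75, list = FALSE)`. The outcome vector is made up.

```r
set.seed(42)  # make the random split reproducible

# Toy outcome vector with the two class labels printed above.
outcome <- factor(c(rep("Died", 8), rep("Survived", 12)))

# Sample 75% of the row indices within each class (stratified split).
train_idx <- unlist(
  lapply(split(seq_along(outcome), outcome), function(idx) {
    idx[sample.int(length(idx), size = ceiling(0.75 * length(idx)))]
  }),
  use.names = FALSE
)
test_idx <- setdiff(seq_along(outcome), train_idx)

print(table(outcome[train_idx]))
print(table(outcome[test_idx]))
```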

Notice: Because of the NA values in the blood samples, it was not possible to fit any caret model (e.g. RF, parRF, bayesglm).
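One possible way around the missing values, which was not part of this analysis, is to collapse the samples to one row per patient and impute what is still missing. The sketch below keeps the last available value of each test per patient and fills the remaining gaps with the column median. The helper `last_non_na` and all values are hypothetical.

```r
# Toy long-format data: each blood draw reports only some of the tests.
samples <- data.frame(
  PATIENT_ID       = c(1, 1, 2, 2, 3),
  `neutrophils(%)` = c(NA, 82.4, 65.1, NA, NA),
  `(%)lymphocyte`  = c(11.4, NA, NA, 25.0, 4.1),
  check.names = FALSE
)

# Last available (non-missing) value of a vector, or NA if there is none.
last_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else x[length(x)]
}

value_cols  <- setdiff(names(samples), "PATIENT_ID")
per_patient <- data.frame(PATIENT_ID = sort(unique(samples$PATIENT_ID)))

for (col in value_cols) {
  # One value per patient, ordered by PATIENT_ID.
  values <- as.numeric(tapply(samples[[col]], samples$PATIENT_ID, last_non_na))
  # Fill patients without any measurement with the column median.
  values[is.na(values)] <- median(values, na.rm = TRUE)
  per_patient[[col]] <- values
}

print(per_patient)
```

caret also offers imputation through `preProcess`, for example with `method = "medianImpute"`. Whether any imputation is appropriate for clinical data of this kind is a separate question that the sketch does not answer.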

Summary

The main purpose of the analysis was to compare the white blood cell composition of patients who died and patients who survived. To analyse mortality, three parameters were taken into account: the white blood cell count, the percentage of lymphocytes and the percentage of neutrophils. The blood test results were presented in several graphs. The analysis indicates that for patients who died these three parameters were outside the normal ranges. The author tried to create a predictive model to forecast patient mortality, using models such as RF, parRF and bayesglm. Because the blood tests were incomplete, the author was not able to apply any model from the caret library. Nevertheless, the data were divided into training and testing sets and the regression chart was made.

Bibliography

Yan, L., Zhang, H.-T., et al., "An interpretable mortality prediction model for COVID-19 patients", Nature Machine Intelligence, 2020 (https://www.nature.com/articles/s42256-020-0180-7)